.. _csv_analysis:

Working with SOEP data in csv format
**************************************

SOEP offers its data in file formats specific to statistical programs (e.g. Stata ``.dta``) and also as comma-separated values (csv) files. The csv files let you read the unformatted data directly into a statistical program of your choice. This example shows how to open SOEP data of data version v.36 in csv format with an old Stata version (12) and how to prepare the data efficiently.

**Create an exercise path with four subfolders:**

.. figure:: png/uebungspfade.png
   :align: center

**Example:**

- H:/material/exercises/do
- H:/material/exercises/output
- H:/material/exercises/temp
- H:/material/exercises/log

These folders store your scripts, log files, datasets, and temporary datasets. Open an empty do-file and define the paths you created as globals:

.. literalinclude:: docs/import_csv.do
   :linenos:
   :lines: 8-16

The global ``AVZ`` defines the main path. The main path is subdivided using the globals ``MY_IN_PATH``, ``MY_DO_FILES``, ``MY_LOG_OUT``, ``MY_OUT_DATA`` and ``MY_OUT_TEMP``. The global ``MY_IN_PATH`` contains the path to the data you ordered. For the following script to work, ``MY_IN_PATH`` must point to the folder that holds the SOEP csv files of all datasets.

Each dataset should always come as three csv files. To import and prepare the dataset jugendl in csv format, we need the following csv files:

- jugendl.csv
- jugendl_variables.csv
- jugendl_values.csv

In the SOEP, the main csv file of each dataset contains the variables as columns, with their numerical values. The variables and values csv files contain the variable labels and the value labels for that dataset.

First, some Stata packages have to be installed before the process can start.

.. literalinclude:: docs/import_csv.do
   :linenos:
   :lines: 19-35

Once the packages are installed, you need to define the following functions so that you can label your dataset later.
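
For orientation, labelling in Stata comes down to three built-in commands: ``label variable``, ``label define`` and ``label values``. The following is a minimal sketch on a toy dataset, not the SOEP code: the variable ``sex``, the label name ``sex_lbl`` and the label texts are made up for illustration. The functions in this tutorial presumably automate steps of this kind, using the labels stored in the variables and values csv files.

.. code-block:: stata

   * Toy dataset with one numeric variable (hypothetical, not SOEP data)
   clear
   input byte sex
   1
   2
   2
   end

   * Attach a variable label
   label variable sex "Gender"

   * Define a set of value labels and attach it to the variable
   label define sex_lbl 1 "male" 2 "female"
   label values sex sex_lbl

   * Inspect the result
   describe sex
   tabulate sex

Doing this by hand for every variable of a SOEP dataset would be tedious, which is why the labelling is wrapped in reusable functions.
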
We define the function ``soeplabelsvars`` to link the variables to their variable labels.

.. literalinclude:: docs/import_csv.do
   :linenos:
   :lines: 37-61

The ``soeplabelvals`` function links the values in the dataset to their value labels.

.. literalinclude:: docs/import_csv.do
   :linenos:
   :lines: 63-85

After both functions have been loaded, we store the name of the dataset we want to import and prepare in a local. We load the csv file with the ``insheet`` command. Then we apply the two functions, which use the variables and values csv files provided by SOEP to label the data.

.. literalinclude:: docs/import_csv.do
   :linenos:
   :lines: 87-91

Congratulations, you should now have a fully labeled dataset!

Last change: |today|